146 research outputs found

    The MEROPS Database

    Many proteins undergo important post-translational proteolytic processing to remove targeting signals and activation peptides, and most proteins undergo proteolytic inactivation and catabolism. The enzymes that hydrolyse the peptide bonds in proteins and peptides are known as peptidases, proteases or proteolytic enzymes. The MEROPS database (http://merops.sanger.ac.uk) presents the classification and nomenclature of peptidases, their inhibitors and substrates. In 1993 we proposed the scheme for the classification of peptidases that has been internationally accepted, and in 1996 we established the MEROPS database. Protein inhibitors have been included in the database since 2004. About 2% of the genes in a genome encode peptidase homologues, and a further 1% encode protein inhibitors. For example, the human genome has 1037 genes encoding peptidase homologues (of which 643 are known or predicted to be active peptidases) and 433 protein inhibitor genes (of which 144 have been biochemically characterized as inhibitors).

The MEROPS classification is hierarchical. Sequences are grouped into a peptidase species (each of which is given a unique identifier, for example C01.060 for cathepsin B); peptidase species are grouped into a family (for example C1); and families grouped into a clan (for example CA). To be included in the same protein species, sequences must be derived from the same node on a dendrogram derived from the family sequence alignment and known (or predicted) to share similar specificity. To be included in the same family sequences must be homologous over the sequence domain that contains the active site residues (peptidases) or reactive site (inhibitors). To be included in the same clan, the proteins must share similar tertiary structures (or the same linear arrangement of active site residues if the structure is unknown). Over 117,000 peptidase homologues are classified into 3114 protein species, 205 families and 52 clans, and 12,104 protein inhibitors are classified into 663 protein species, 64 families and 33 clans.
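
The three-level hierarchy described above can be sketched in a few lines of Python. This is only an illustration, not the database's own code: the identifier layout (one letter, a two-digit family number, a dot, a three-digit serial) is assumed from the single example C01.060, and the `FAMILY_TO_CLAN` table is a hypothetical stand-in holding just the one family-to-clan pair mentioned in the text. Clan membership cannot be read off the identifier, because it rests on tertiary structure, so it has to come from a lookup.

```python
import re
from dataclasses import dataclass
from typing import Optional

# Assumed layout, inferred only from the example "C01.060":
# one letter, two-digit family number, ".", three-digit serial.
_ID_PATTERN = re.compile(r"^([A-Z])(\d{2})\.(\d{3})$")

# Hypothetical lookup table; only the pair named in the text is included.
FAMILY_TO_CLAN = {"C1": "CA"}


@dataclass(frozen=True)
class Classification:
    identifier: str          # protein species level, e.g. "C01.060"
    family: str              # family level, e.g. "C1"
    clan: Optional[str]      # clan level, e.g. "CA"; None if not in the table


def classify(identifier: str) -> Classification:
    """Split an identifier into its family and look up the clan."""
    match = _ID_PATTERN.match(identifier)
    if match is None:
        raise ValueError(f"unrecognised identifier: {identifier!r}")
    letter, family_number, _serial = match.groups()
    # "C01" is written as family "C1": drop the leading zero.
    family = f"{letter}{int(family_number)}"
    return Classification(identifier, family, FAMILY_TO_CLAN.get(family))


cathepsin_b = classify("C01.060")
print(cathepsin_b.family, cathepsin_b.clan)
```

Real identifiers may follow patterns this sketch does not cover, in which case `classify` raises `ValueError` rather than guessing.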

The database includes manually curated summaries for each clan, family and protein species. There are also sequence alignments and manually curated bibliographies (with over 41,000 references) at every level. In addition to protein inhibitors we also include 158 manually curated summaries for synthetic and naturally occurring small molecule inhibitors. There is also a summary page for each organism listing all known homologues and an analysis highlighting significant presences, absences or gene family expansions for organisms with a completely sequenced genome. 

The MEROPS database includes known peptidase substrates: naturally occurring peptides and proteins, and synthetic substrates. Currently there are 4091 cleavages of synthetic substrates and 95,413 cleavages of proteins (of which 74,740 are physiological). Cleavages in proteins are mapped to UniProt entries. An alignment of very close homologues of each substrate sequence is shown, highlighting residues around each cleavage site indicating whether the peptidase is known to accept the amino acid at that position or not. Cleavage sites that are conserved are likely to be physiological; cleavage sites that are not conserved may be pathological for the species in which they occur or coincidental.

The MEROPS data is freely available to download from our FTP site (http://ftp.sanger.ac.uk/pub/MEROPS) and via our Distributed Annotation System (DAS) server (http://das.sanger.ac.uk/das/merops).

    Pepsin homologues in bacteria

    Background: Peptidase family A1, to which pepsin belongs, had been assumed to be restricted to eukaryotes. The tertiary structure of pepsin shows two lobes with similar folds and it has been suggested that the gene has arisen from an ancient duplication and fusion event. The only sequence similarity between the lobes is restricted to the motif around the active site aspartate and a hydrophobic-hydrophobic-Gly motif. Together, these contribute to an essential structural feature known as a psi-loop. There is one such psi-loop in each lobe, and so each lobe presents an active Asp. The human immunodeficiency virus peptidase, retropepsin, from peptidase family A2 also has a similar fold but consists of one lobe only and has to dimerize to be active. All known members of family A1 show the bilobed structure, but it is unclear if the ancestor of family A1 was similar to an A2 peptidase, or if the ancestral retropepsin was derived from a half-pepsin gene. The presence of a pepsin homologue in a prokaryote might give insights into the evolution of the pepsin family. Results: Homologues of the aspartic peptidase pepsin have been found in the completed genomic sequences from seven species of bacteria. The bacterial homologues, unlike those from eukaryotes, do not possess signal peptides, and would therefore be intracellular acting at neutral pH. The bacterial homologues have Thr218 replaced by Asp, a change which in renin has been shown to confer activity at neutral pH. No pepsin homologues could be detected in any archaean genome. Conclusion: The peptidase family A1 is found in some species of bacteria as well as eukaryotes. The bacterial homologues fall into two groups, one from oceanic bacteria and one from plant symbionts. The bacterial homologues are all predicted to be intracellular proteins, unlike the eukaryotic enzymes. The bacterial homologues are bilobed like pepsin, implying that if no horizontal gene transfer has occurred the duplication and fusion event might be very ancient indeed, preceding the divergence of bacteria and eukaryotes. It is unclear whether all the bacterial homologues are derived from horizontal gene transfer, but those from the plant symbionts probably are. The homologues from oceanic bacteria are most closely related to memapsins (or BACE-1 and BACE-2), but are so divergent that they are close to the root of the phylogenetic tree and to the division of the A1 family into two subfamilies.

    MEROPS: the peptidase database

    Peptidases (proteolytic enzymes) and their natural, protein inhibitors are of great relevance to biology, medicine and biotechnology. The MEROPS database aims to fulfil the need for an integrated source of information about these proteins. The organizational principle of the database is a hierarchical classification in which homologous sets of proteins of interest are grouped into families and the homologous families are grouped in clans. The most important addition to the database has been newly written, concise text annotations for each peptidase family. Other forms of information recently added include highlighting of active site residues (or the replacements that render some homologues inactive) in the sequence displays and BlastP search results, dynamically generated alignments and trees at the peptidase or inhibitor level, and a curated list of human and mouse homologues that have been experimentally characterized as active. A new way to display information at taxonomic levels higher than species has been devised. In the Literature pages, references have been flagged to draw attention to particularly ‘hot’ topics.

    A comparison of Pfam and MEROPS: Two databases, one comprehensive, and one specialised.

    BACKGROUND: We wished to compare two databases based on sequence similarity: one that aims to be comprehensive in its coverage of known sequences, and one that specialises in a relatively small subset of known sequences. One of the motivations behind this study was quality control. Pfam is a comprehensive collection of alignments and hidden Markov models representing families of proteins and domains. MEROPS is a catalogue and classification of enzymes with proteolytic activity (peptidases or proteases). These secondary databases are used by researchers worldwide, yet their contents are not peer reviewed. Therefore, we hoped that a systematic comparison of the contents of Pfam and MEROPS would highlight missing members and false-positives leading to improvements in quality of both databases. An additional reason for carrying out this study was to explore the extent of consensus in the definition of a protein family. RESULTS: About half (89 out of 174) of the peptidase families in MEROPS overlapped single Pfam families. A further 32 MEROPS families overlapped multiple Pfam families. Where possible, new Pfam families were built to represent most of the MEROPS families that did not overlap Pfam. When comparing the numbers of sequences found in the overlap between a MEROPS family and its corresponding Pfam family, in most cases the overlap was substantial (52 pairs of MEROPS and Pfam families had an intersection size of greater than 75% of the union) but there were some differences in the sets of sequences included in the MEROPS families versus the overlapping Pfam families. CONCLUSIONS: A number of the discrepancies between MEROPS families and their corresponding Pfam families arose from differences in the aims and philosophies of the two databases. Examination of some of the discrepancies highlighted additional members of families, which have subsequently been added in both Pfam and MEROPS. This has led to improvements in the quality of both databases. 
Overall there was a great deal of consensus between the databases in definitions of a protein family.
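
The overlap measure quoted in the results (intersection size as a fraction of the union, with 75% as the threshold for a substantial overlap) can be illustrated with a short Python sketch. The function name and the toy accession sets below are hypothetical; they are not taken from the study's data or code.

```python
def overlap_fraction(family_a, family_b):
    """Return |A ∩ B| / |A ∪ B| for two collections of sequence identifiers."""
    a, b = set(family_a), set(family_b)
    union = a | b
    if not union:
        # Two empty families share nothing; avoid dividing by zero.
        return 0.0
    return len(a & b) / len(union)


# Hypothetical sequence identifiers, for illustration only.
merops_family = {"seq1", "seq2", "seq3", "seq4"}
pfam_family = {"seq2", "seq3", "seq4", "seq5"}

fraction = overlap_fraction(merops_family, pfam_family)
# 3 shared sequences out of 5 distinct ones gives 0.6, below the 75% mark.
print(f"{fraction:.2f}", "substantial" if fraction > 0.75 else "partial")
```

Sequences present in only one of the two sets are the candidates for the missing members or false positives that the comparison set out to find.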

    Identification of the active site of legumain links it to caspases, clostripain and gingipains in a new clan of cysteine endopeptidases

    We show by site-directed mutagenesis that the catalytic residues of mammalian legumain, a recently discovered lysosomal asparaginyl cysteine endopeptidase, form a catalytic dyad in the motif His-Gly-spacer-Ala-Cys. We note that the same motif is present in the caspases, aspartate-specific endopeptidases central to the process of apoptosis in animal cells, and also in the families of clostripain and gingipain which are arginyl/lysyl endopeptidases of pathogenic bacteria. We propose that the four families have similar protein folds, are evolutionarily related in clan CD, and have common characteristics including substrate specificities dominated by the interactions of the S1 subsite.

    A Primitive Enzyme for a Primitive Cell: The Protease Required for Excystation of Giardia

    Protozoan parasites of the genus Giardia are one of the earliest lineages of eukaryotic cells. To initiate infection, trophozoites emerge from a cyst in the host. Excystation is blocked by specific cysteine protease inhibitors. Using a biotinylated inhibitor, the target protease was identified and its corresponding gene cloned. The protease was localized to vesicles that release their contents just prior to excystation. The Giardia protease is the earliest known branch of the cathepsin B family. Its phylogeny confirms that the cathepsin B lineage evolved in primitive eukaryotic cells, prior to the divergence of plant and animal kingdoms, and underscores the diversity of cellular functions that this enzyme family facilitates.

    LUD, a new protein domain associated with lactate utilization.

    Background: A novel highly conserved protein domain, DUF162 [Pfam: PF02589], can be mapped to two proteins: LutB and LutC. Both proteins are encoded by a highly conserved LutABC operon, which has been implicated in lactate utilization in bacteria. Based on our analysis of its sequence, structure, and recent experimental evidence reported by other groups, we hereby redefine DUF162 as the LUD domain family. Results: JCSG solved the first crystal structure [PDB:2G40] from the LUD domain family: LutC protein, encoded by ORF DR_1909, of Deinococcus radiodurans. LutC shares features with domains in the functionally diverse ISOCOT superfamily. We have observed that the LUD domain has an increased abundance in the human gut microbiome. Conclusions: We propose a model for the substrate and cofactor binding and regulation in the LUD domain. The significance of LUD-containing proteins in the human gut microbiome, and the implication of lactate metabolism in the radiation-resistance of Deinococcus radiodurans are discussed.

    Structural Analysis of Papain-Like NlpC/P60 Superfamily Enzymes with a Circularly Permuted Topology Reveals Potential Lipid Binding Sites

    NlpC/P60 superfamily papain-like enzymes play important roles in all kingdoms of life. Two members of this superfamily, LRAT-like and YaeF/YiiX-like families, were predicted to contain a catalytic domain that is circularly permuted such that the catalytic cysteine is located near the C-terminus, instead of at the N-terminus. These permuted enzymes are widespread in viruses, pathogenic bacteria, and eukaryotes. We determined the crystal structure of a member of the YaeF/YiiX-like family from Bacillus cereus in complex with lysine. The structure, which adopts a ligand-induced, “closed” conformation, confirms the circular permutation of catalytic residues. A comparative analysis of other related protein structures within the NlpC/P60 superfamily is presented. Permuted NlpC/P60 enzymes contain a similar conserved core and arrangement of catalytic residues, including a Cys/His-containing triad and an additional conserved tyrosine. More surprisingly, permuted enzymes have a hydrophobic S1 binding pocket that is distinct from previously characterized enzymes in the family, indicative of novel substrate specificity. Further analysis of a structural homolog, YiiX (PDB 2if6), identified a fatty acid in the conserved hydrophobic pocket, thus providing additional insights into possible function of these novel enzymes.